HARDFS: hardening HDFS with selective and lightweight versioning

Authors

  • Thanh Do
  • Tyler Harter
  • Yingchao Liu
  • Haryadi S. Gunawi
  • Andrea C. Arpaci-Dusseau
  • Remzi H. Arpaci-Dusseau
Abstract

We harden the Hadoop Distributed File System (HDFS) against fail-silent (non fail-stop) behaviors that result from memory corruption and software bugs using a new approach: selective and lightweight versioning (SLEEVE). With this approach, actions performed by important subsystems of HDFS (e.g., namespace management) are checked by a second implementation of the subsystem that uses lightweight, approximate data structures. We show that HARDFS detects and recovers from a wide range of fail-silent behaviors caused by random bit flips, targeted corruptions, and real software bugs. In particular, HARDFS handles 90% of the fail-silent faults that result from random memory corruption and correctly detects and recovers from 100% of 78 targeted corruptions and 5 real-world bugs. Moreover, it recovers orders of magnitude faster than full reboot by using micro-recovery. The extra protection in HARDFS incurs minimal performance and space overheads.
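The idea of checking a subsystem against a second, lightweight version can be illustrated with a small sketch. This is not the HARDFS implementation: the abstract does not say which approximate data structures are used, so the sketch assumes a counted set of 4-byte path fingerprints, and every class and method name below (MainNamespace, LightweightNamespace, CheckedNamespace, MismatchError) is hypothetical. It shows detection only; the micro-recovery mentioned in the abstract is not modeled.

```python
import hashlib
from collections import Counter


class MismatchError(Exception):
    """Raised when the two versions disagree (a suspected fail-silent fault)."""


class MainNamespace:
    """Full version: keeps complete metadata for every path."""

    def __init__(self):
        self.files = {}

    def create(self, path, size):
        self.files[path] = {"size": size}

    def delete(self, path):
        self.files.pop(path, None)

    def exists(self, path):
        return path in self.files


class LightweightNamespace:
    """Second version (assumed design): stores only counts of short
    path fingerprints, so it is small but approximate."""

    def __init__(self):
        self.counts = Counter()

    @staticmethod
    def _fingerprint(path):
        # 4-byte fingerprint; distinct paths can collide, which is the
        # source of approximation in this sketch.
        return hashlib.sha256(path.encode("utf-8")).digest()[:4]

    def create(self, path):
        self.counts[self._fingerprint(path)] += 1

    def delete(self, path):
        fp = self._fingerprint(path)
        if self.counts[fp] > 0:
            self.counts[fp] -= 1

    def maybe_exists(self, path):
        return self.counts[self._fingerprint(path)] > 0


class CheckedNamespace:
    """Applies each action to both versions and cross-checks answers."""

    def __init__(self):
        self.main = MainNamespace()
        self.shadow = LightweightNamespace()

    def exists(self, path):
        answer = self.main.exists(path)
        # Any disagreement is treated as a suspected fault. A rare
        # fingerprint collision could also cause a false alarm here.
        if answer != self.shadow.maybe_exists(path):
            raise MismatchError(path)
        return answer

    def create(self, path, size):
        if self.exists(path):
            raise FileExistsError(path)
        self.main.create(path, size)
        self.shadow.create(path)

    def delete(self, path):
        if self.exists(path):
            self.main.delete(path)
            self.shadow.delete(path)
```

In this sketch, silently dropping an entry from `main.files` (standing in for a memory corruption) makes the next `exists` call raise `MismatchError` instead of returning a wrong answer. The shadow version holds a few bytes per path rather than full metadata, which is the kind of space trade-off the word "lightweight" suggests.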

Related resources

Mosquito: Another One Bites the Data Upload STream

Mosquito is a lightweight and adaptive physical design framework for Hadoop. Mosquito connects to existing data pipelines in Hadoop MapReduce and/or HDFS, observes the data, and creates better physical designs, i.e. indexes, as a byproduct. Our approach is minimally invasive, yet it allows users and developers to easily improve the runtime of Hadoop. We present three important use cases: first,...

Model-Driven Development of Versioning Systems: An Evaluation of Different Approaches

This paper analyzes the domain of versioning systems and compares three approaches to generating such systems from models. In the first approach, we define a domain-specific modeling language as a lightweight extension of UML and use templates to generate a middleware-based versioning system. In the second approach, we define a domain-specific data definition and manipulation language that can b...

Period of Grace: A New Paradigm for Efficient Soft Error Hardening

In late-age silicon, soft errors become an issue even for low-margin products. Since classical hardening techniques are associated with costs which may not be acceptable for such ICs, selective hardening which targets only a subset of all possible soft errors has been suggested. We propose a soft error selection method based on severity of an error’s impact on system behavior. Some soft errors ...

Lightweight Fault Tolerance in Large-Scale Distributed Graph Processing

The success of Google’s Pregel framework in distributed graph processing has inspired a surging interest in developing Pregel-like platforms featuring a user-friendly “think like a vertex” programming model. Existing Pregel-like systems support a fault tolerance mechanism called checkpointing, which periodically saves computation states as checkpoints to HDFS, so that when a failure happens, co...

A Versatile and User-Oriented Versioning File System

File versioning is a useful technique for recording a history of changes. Applications of versioning include backups and disaster recovery, as well as monitoring intruders’ activities. Alas, modern systems do not include an automatic and easy-to-use file versioning system. Existing backup solutions are slow and inflexible for users. Even worse, they often lack backups for the most recent day’s ...

Publication date: 2013